Foreword



This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The present document specifies the codec specific RTP protocol details applying to packet switched conversational 
multimedia applications within the 3GPP IM Subsystem. 

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit: 

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 



Introduction 



The present document contains a specification for required protocol usage within 3GPP specified Conversational Packet
Switched Multimedia Services [5], which are based on the IP Multimedia Subsystem (IM Subsystem). The IM Subsystem
specifically includes the conversational IP multimedia services, whose service architecture, call control and
media capability control procedures have been defined in 3GPP TS 24.229 [7], and are based on the 3GPP adopted
version of the IETF Session Initiation Protocol (SIP) [1].

The conversational packet switched multimedia service depends on the IM Subsystem. The individual media types are
independently encoded and packetized into separate Real-time Transport Protocol (RTP) packets. These packets are
then transported end-to-end inside UDP datagrams over real-time IP connections that have been negotiated and opened
between the terminals during the SIP call as specified in 3GPP TS 24.229 [7].

The UEs operating within the IM Subsystem need to provide encoding/decoding of the derived codecs, and perform
corresponding packetization/depacketization functions. The logical binding between the media streams is handled in the SIP
session layer, and inter-media synchronization in the receiver is handled with the use of RTP time stamps.
1 Scope



The present document introduces the required protocols for packet switched conversational multimedia applications 
within 3GPP IP Multimedia Subsystem. Visual and sound communications are specifically addressed. The intended 
applications are assumed to require low-delay, real-time functionality. 

The present document describes the required protocol related elements for 3G PS multimedia terminals:

• required SDP signalling regarding the media type bit rate, packet size, packet transport frequency; 

• usage of RTP payload for media types; 

• bandwidth adaptation; 

• QoS negotiation. 

The present document is applicable, but not limited, to packet switched video telephony. 
The applicability of the present document to GERAN is FFS. 
3 Definitions and abbreviations

3.1 Definitions 

For the purposes of the present document, the following term and definition applies: 

3G PS multimedia terminal: terminal based on IETF SIP/SDP internet standards modified by 3GPP for purposes of 
3GPP IM Subsystem services 

3.2 Abbreviations 

For the purposes of the present document, the following abbreviations apply: 

AMR Adaptive MultiRate codec 

IETF Internet Engineering Task Force 

IM Subsystem Internet protocol Multimedia Subsystem 

ITU-T International Telecommunications Union-Telecommunications 

RFC IETF Request For Comments 

RTCP RTP Control Protocol

RTP Real-time Transport Protocol 

SDP Session Description Protocol 

SIP Session Initiation Protocol 
4 General



3G PS multimedia terminals provide real-time video, audio, or data, in any combination, including none, over 3GPP IM 
Subsystem. Terminals are based on IETF defined multimedia protocols SIP, SDP, RTP and RTCP. Communication 
may be either 1-way or 2-way. Such terminals may be part of a portable device or integrated into an automobile or other 
non-fixed location device. They may also be fixed, stand-alone devices; for example, a video telephone or kiosk. 
Multimedia terminals may also be integrated into PCs and workstations. 

In addition, interoperation with other types of multimedia telephone terminals, such as 3G-324M, may be possible.
In such a case, however, a media gateway functionality supporting 3G-324M - IM Subsystem interworking will be
required within or outside the IM Subsystem.

Figure 1 presents the user plane protocol stack of a 3G PS conversational multimedia terminal explaining the transport 
of different media types and QoS reports. 



[Figure 1: protocol stack, layers from top to bottom: Conversational Multimedia Application; Audio, Video and Text; Payload formats; RTP; UDP; IP]

Figure 1 - User plane protocol stack for 3G PS conversational multimedia terminal



5 Media type requirements



Media type RTP payload usage is specified in this clause. The media types and corresponding codecs are specified in 
3GPP TS 26.235 [5]. The continuous media type RTP payloads are mapped to RTP packets according to IETF RTP 
Profile for Audio and Video Conferences with Minimal Control in RFC 3551 [4]. 



5.1 Audio

5.1.1 RTP session description parameters

The IETF AMR and AMR-WB RTP payload format [19] offers different options. Here is the list of options and how
they should be used by the transmitter. The receiver shall at least support the options as they are listed:

• the bandwidth efficient operation shall be used;

• only one speech frame shall be encapsulated in each RTP packet;

• the multi-channel session shall not be used;

• interleaving shall not be used;

• internal CRC shall not be used.
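
The option list above can be illustrated with a minimal Python sketch that builds an SDP audio media section. The parameter names octet-align and crc and the ptime/maxptime attributes come from the SDP mapping of the AMR payload format; the port number, the dynamic payload type 97 and the function name are hypothetical values chosen only for illustration.

```python
def amr_sdp_media_section(port: int, payload_type: int = 97,
                          wideband: bool = False) -> str:
    """Build an SDP audio section that follows the option list above.

    port and payload_type are illustrative assumptions, not values
    taken from the present document.
    """
    encoding = "AMR-WB/16000" if wideband else "AMR/8000"
    lines = [
        f"m=audio {port} RTP/AVP {payload_type}",
        # "/1" = single channel (no multi-channel session).
        f"a=rtpmap:{payload_type} {encoding}/1",
        # octet-align=0 selects bandwidth efficient operation, crc=0
        # disables the internal CRC; leaving out the "interleaving"
        # parameter means interleaving is not used.
        f"a=fmtp:{payload_type} octet-align=0; crc=0",
        # One 20 ms speech frame per RTP packet.
        "a=ptime:20",
        "a=maxptime:20",
    ]
    return "\r\n".join(lines) + "\r\n"


print(amr_sdp_media_section(49152))
```

Both octet-align=0 and crc=0 are the defaults of the payload format, so writing them out is a readability choice rather than a requirement.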
5.2 Video

Video packets should be kept small to allow better error resilience and to minimize the transmission delay in
conversational service. The size of each packet shall be kept smaller than 512 bytes.

5.3 Real time text 

The real time text media type RTP payload format for ITU-T Recommendation T.140 is specified in [27]. Redundant
transmission provided by the RTP payload format is recommended on error-prone channels.



6 Call control 

Functional requirements for call control are specified in 3GPP TS 23.228 [8]. 

The required signalling functions are specified in 3GPP TS 24.228 [6] and call control protocols in
3GPP TS 24.229 [7].

QoS authorization issues and interworking with the IM subsystem in general are covered in 3GPP TS 23.207 [10]. 



7 Bearer control 

The media control is based on the declaration of terminal media capability sets in the SDP part of appropriate SIP messages.
The usage of bearer bandwidth can be effectively controlled by adjusting the media type encoder bit rates.

7.1 Bandwidth 

The bandwidth information of each media type shall be carried in SDP messages at both the session and the media level
during codec negotiation, session establishment and resource reallocation. Note that for RTP based applications,
'b=AS:' gives the RTP "session bandwidth" (including UDP/IP overhead) as defined in section 6.2 of [3].

The bandwidth for RTCP traffic shall be described using the "RS" and "RR" SDP bandwidth modifiers at media level, 
as specified by [28]. Therefore, a conversational multimedia terminal shall include the "b=RS:" and "b=RR:" fields in 
SDP, and shall be able to interpret them. There shall be a limit on the allowed RTCP bandwidth for a session signalled 
by the terminal. This limit is defined as follows: 

• 4000 bps for the RS field (at media level); 

• 3000 bps for the RR field (at media level). 
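
As an illustration of the media-level bandwidth fields described above, the following Python sketch formats the b=AS, b=RS and b=RR lines and rejects RTCP values above the stated limits. The function name is hypothetical, and treating the limits as inclusive upper bounds is an assumption made for this sketch.

```python
from __future__ import annotations

RS_LIMIT_BPS = 4000  # limit for the RS field at media level
RR_LIMIT_BPS = 3000  # limit for the RR field at media level


def media_bandwidth_lines(as_kbps: int, rs_bps: int, rr_bps: int) -> list[str]:
    """Return the media-level SDP bandwidth lines.

    b=AS is expressed in kbps, while b=RS and b=RR are in bits per second.
    The limits are treated as inclusive upper bounds (an assumption).
    """
    if not 0 <= rs_bps <= RS_LIMIT_BPS:
        raise ValueError(f"RS must be within 0..{RS_LIMIT_BPS} bps")
    if not 0 <= rr_bps <= RR_LIMIT_BPS:
        raise ValueError(f"RR must be within 0..{RR_LIMIT_BPS} bps")
    return [f"b=AS:{as_kbps}", f"b=RS:{rs_bps}", f"b=RR:{rr_bps}"]


print("\n".join(media_bandwidth_lines(29, 725, 725)))
```

The RS and RR values in the usage line are arbitrary examples, not values recommended by the present document.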



7.2 QoS negotiation 



The QoS architecture and concept is specified in 3GPP TS 23.107 [9]. The end-to-end QoS framework involving GPRS 
and UMTS is specified in 3GPP TS 23.207 [10]. The applicable general QoS mechanism and service description for the 
GPRS in GSM and UMTS is specified in 3GPP TS 23.060 [11].

7.3 RTP receiver 

The RTP receiver implementation and functionality, including lost and delayed packet processing as well as the jitter buffer,
are out of scope of the present document.
Annex A (informative):
Optional enhancements 

This annex is intended for informational purposes only. This is not an integral part of the present document. 



A.1 Video enhancements 

This clause gives informative recommendations for the video media type control. 

The SDP attributes regarding the video frame rate and the quality of media encoding should be used to ensure good 
video service. The recommended usage of these attributes is FFS.

a=framerate:<frame rate> describes the maximum video frame rate attribute in frames/second. Fractional
values of <frame rate> are allowed.

a=quality:<quality> describes the quality of media encoding attribute, where the <quality> is a
value in [0..10] with 10 indicating the best quality.
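
A small Python sketch of how a terminal might format these two attributes follows. The function name is hypothetical, and treating <quality> as an integer is an assumption, since the text above only gives the range [0..10].

```python
from __future__ import annotations


def video_hint_attributes(frame_rate: float, quality: int) -> list[str]:
    """Format the a=framerate and a=quality SDP attributes.

    frame_rate may be fractional; quality is assumed to be an integer
    in the range 0..10, with 10 indicating the best quality.
    """
    if frame_rate <= 0:
        raise ValueError("frame rate must be positive")
    if not 0 <= quality <= 10:
        raise ValueError("quality must be within 0..10")
    # ":g" drops a trailing ".0", so 15.0 is written as 15.
    return [f"a=framerate:{frame_rate:g}", f"a=quality:{quality}"]


print(video_hint_attributes(12.5, 7))
```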
Annex B (informative):

Mapping of SDP parameters to UMTS QoS parameters 

This clause gives recommendations for mapping SDP parameters to UMTS QoS parameters for conversational
multimedia applications. Different use cases are considered. Each use case generates an example table of QoS profile
parameters (with values for IPv4 and IPv6 addressing). The values indicated are derived from the applications'
QoS requirements, and may not be fulfilled by the network. In the parameters for guaranteed and maximum bit rates, a
granularity of 1 kbps is assumed for bearers up to 64 kbps, as defined in TS 24.008. Therefore the "Ceiling" function
is used to round fractional values up, wherever needed. In addition, the same specification defines a granularity of
10 bytes for the Maximum SDU size values. This is taken into account in the computation of this field in the QoS
profile.

Use case 1 - Voice over IP 

This use case includes the scenario in which two conversational multimedia terminals establish a bi-directional Voice 
over IP (VoIP) connection for speech communication, using the AMR or AMR-WB codecs with the same bit rate in 
both uplink and downlink directions. 

For example an AMR VoIP stream encoded at 12.2 kbps, with one speech frame encapsulated into an RTP packet, 
would yield IP packets of the following size (using the mandated bandwidth efficient mode): 

20 (IPv4) + 8 (UDP) + 12 (RTP) + 32 (AMR RTP payload) = 72 bytes, or 

40 (IPv6 with no extension headers) + 8 (UDP) + 12 (RTP) + 32 (AMR RTP payload) = 92 bytes. 



The gross bit rate including uncompressed RTP/UDP/IPv4 headers would be 28.8 kbps. The value in the b=AS media 
level parameter would be 29. The gross bit rate including uncompressed RTP/UDP/IPv6 headers would be 36.8 kbps. 
The value in the b=AS media level parameter would be 37. 
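
The arithmetic above can be reproduced with a short Python sketch. It assumes the bandwidth efficient payload layout (a 4-bit codec mode request, one 6-bit table-of-contents entry, then the speech bits, padded to a whole octet) and one 20 ms frame per packet, i.e. 50 packets per second. The function and constant names are illustrative only.

```python
# Speech bits per 20 ms frame for the AMR modes used in this annex.
FRAME_BITS = {"AMR 4.75": 95, "AMR 10.2": 204, "AMR 12.2": 244}

CMR_BITS = 4              # codec mode request, once per payload
TOC_BITS = 6              # one table-of-contents entry per speech frame
RTP_UDP_BYTES = 12 + 8    # uncompressed RTP and UDP headers
IP_BYTES = {4: 20, 6: 40} # no IPv4 options, no IPv6 extension headers
PACKETS_PER_SECOND = 50   # one 20 ms frame per RTP packet


def payload_bytes(frame_bits: int) -> int:
    """Bandwidth efficient payload size, padded up to a whole octet."""
    return (CMR_BITS + TOC_BITS + frame_bits + 7) // 8


def ip_packet_bytes(frame_bits: int, ip_version: int) -> int:
    return IP_BYTES[ip_version] + RTP_UDP_BYTES + payload_bytes(frame_bits)


def gross_bps(frame_bits: int, ip_version: int) -> int:
    return ip_packet_bytes(frame_bits, ip_version) * 8 * PACKETS_PER_SECOND


def b_as_kbps(frame_bits: int, ip_version: int) -> int:
    """b=AS value: gross bit rate rounded up to a whole kbps."""
    return -(-gross_bps(frame_bits, ip_version) // 1000)


bits = FRAME_BITS["AMR 12.2"]
for version in (4, 6):
    print(version, ip_packet_bytes(bits, version),
          gross_bps(bits, version), b_as_kbps(bits, version))
```

For AMR 12.2 kbps this prints 72 bytes, 28800 bps and b=AS 29 for IPv4, and 92 bytes, 36800 bps and b=AS 37 for IPv6, matching the values above. Integer arithmetic is used throughout so that the rounding step is exact.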

To determine the Maximum SDU size parameter, consider the maximum packet size that can be generated
with a speech codec. This is the packet generated by an AMR-WB stream at 23.85 kbps packetized in bandwidth
efficient mode with 1 speech frame per packet. Considering uncompressed RTP/UDP/IPv6 headers, the maximum
packet size is 121 bytes.
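
As a worked check, the sketch below derives the 121-byte packet and rounds it up to the 10-byte granularity, which gives the 130 bytes used for Maximum SDU size in the QoS profile. It assumes 477 speech bits per AMR-WB 23.85 kbps frame plus the 4-bit mode request and 6-bit table-of-contents entry of the bandwidth efficient mode. The names are illustrative.

```python
def round_up(value: int, step: int) -> int:
    """Round value up to the next multiple of step."""
    return -(-value // step) * step


AMR_WB_23_85_FRAME_BITS = 477   # speech bits per 20 ms frame
PAYLOAD_HEADER_BITS = 4 + 6     # mode request + one table-of-contents entry
RTP_UDP_IPV6_BYTES = 12 + 8 + 40
SDU_GRANULARITY_BYTES = 10

payload = (PAYLOAD_HEADER_BITS + AMR_WB_23_85_FRAME_BITS + 7) // 8
largest_packet = RTP_UDP_IPV6_BYTES + payload
max_sdu = round_up(largest_packet, SDU_GRANULARITY_BYTES)

print(payload, largest_packet, max_sdu)  # 61 121 130
```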



The QoS profile would be set then using the following parameters: 
Table B.1: QoS profile for AMR VoIP at 12.2 kbps

QoS parameter | Parameter value | Comment
Delivery of erroneous SDUs | No |
Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering
Traffic class | Conversational |
Maximum SDU size | 130 bytes | 10 bytes granularity. The RTCP packet size might change the maximum SDU size limitation [tbc]
Guaranteed bitrate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL + SDP media bw in UL) = Ceil(30.45) = 31 kbps (for the IPv4 case), Ceil(38.85) = 39 kbps (for the IPv6 case) |
Maximum bit rate for downlink | Ceil(30.45) = 31 kbps (for the IPv4 case), Ceil(38.85) = 39 kbps (for the IPv6 case) |
Guaranteed bitrate for uplink | SDP media bw in UL + 2.5% * (SDP media bw in UL + SDP media bw in DL) = Ceil(30.45) = 31 kbps (for the IPv4 case), Ceil(38.85) = 39 kbps (for the IPv6 case) |
Maximum bit rate for uplink | Ceil(30.45) = 31 kbps (for the IPv4 case), Ceil(38.85) = 39 kbps (for the IPv6 case) |
Residual BER | 10^-5 | 16 bit CRC
SDU error ratio | 7*10^-3 |
Traffic handling priority | Not used in Conversational traffic class |
Transfer delay | 100 ms |
SDU format information | Not used |
Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application
Source statistics descriptor | "Speech" |





In some cases, multiple AMR or AMR-WB rates are available, and rate control techniques allow switching between
different modes based on the received speech quality. For example, if the available AMR mode set is {4.75, 10.2, 12.2}
kbps, the gross bit rates are:



AMR 4.75 kbps: 21.6 kbps (including RTP/UDP/IPv4 headers). [SDP b=AS parameter would be 22]. 
AMR 10.2 kbps: 26.8 kbps (including RTP/UDP/IPv4 headers). [SDP b=AS parameter would be 27]. 
AMR 12.2 kbps: 28.8 kbps (including RTP/UDP/IPv4 headers). [SDP b=AS parameter would be 29]. 

In case of IPv6 addressing, the gross bit rates are: 

AMR 4.75 kbps: 29.6 kbps (including RTP/UDP/IPv6 headers). [SDP b=AS parameter would be 30]. 
AMR 10.2 kbps: 34.8 kbps (including RTP/UDP/IPv6 headers). [SDP b=AS parameter would be 35]. 
AMR 12.2 kbps: 36.8 kbps (including RTP/UDP/IPv6 headers). [SDP b=AS parameter would be 37].

The maximum bit rate is set to the highest mode of the codec. However, the procedure on how to choose the
guaranteed bit rate when several codec rates are available is to be defined. Here we provide an example QoS profile in 
which the guaranteed speech quality is at least that of 10.2 kbps AMR for both uplink and downlink directions, while 
the non-guaranteed maximum quality is that of 12.2 kbps for both uplink and downlink directions. 
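
The guaranteed and maximum bit rates for this example can be sketched with the formula used in Table B.1: the SDP bandwidth of the bearer's own direction plus 2.5% of the sum of both directions, rounded up to a whole kbps. The function and constant names below are illustrative. Exact fractions are used instead of floating point so that the ceiling step cannot be disturbed by rounding error.

```python
from fractions import Fraction
from math import ceil

# 2.5% share added on top of the media bandwidth, as in Table B.1.
OVERHEAD_SHARE = Fraction(25, 1000)


def bearer_rate_kbps(own_bw_kbps: int, reverse_bw_kbps: int) -> int:
    """Own-direction b=AS plus 2.5% of both directions, rounded up."""
    exact = own_bw_kbps + OVERHEAD_SHARE * (own_bw_kbps + reverse_bw_kbps)
    return ceil(exact)


# b=AS values from the mode list above: 10.2 kbps guaranteed, 12.2 kbps maximum.
for label, guaranteed_as, maximum_as in (("IPv4", 27, 29), ("IPv6", 35, 37)):
    gbr = bearer_rate_kbps(guaranteed_as, guaranteed_as)
    mbr = bearer_rate_kbps(maximum_as, maximum_as)
    print(label, gbr, mbr)
```

With the same bit rate in both directions, this yields 29 and 31 kbps for IPv4 and 37 and 39 kbps for IPv6. Setting the reverse bandwidth to 0 gives the unidirectional case.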

Table B.2: QoS profile for AMR VoIP at 3 bit rates with rate control

QoS parameter | Parameter value | Comment
Delivery of erroneous SDUs | No |
Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering
Traffic class | Conversational |
Maximum SDU size | 130 bytes | 10 bytes granularity. The RTCP packet size might change the maximum SDU size limitation [tbc]
Guaranteed bitrate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL + SDP media bw in UL) = Ceil(28.35) = 29 kbps (for the IPv4 case), Ceil(36.75) = 37 kbps (for the IPv6 case) | Guaranteed quality 10.2 kbps
Maximum bit rate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL + SDP media bw in UL) = Ceil(30.45) = 31 kbps (for the IPv4 case), Ceil(38.85) = 39 kbps (for the IPv6 case) | Non-guaranteed quality 12.2 kbps
Guaranteed bitrate for uplink | SDP media bw in UL + 2.5% * (SDP media bw in UL + SDP media bw in DL) = Ceil(28.35) = 29 kbps (for the IPv4 case), Ceil(36.75) = 37 kbps (for the IPv6 case) | Guaranteed quality 10.2 kbps
Maximum bit rate for uplink | SDP media bw in UL + 2.5% * (SDP media bw in UL + SDP media bw in DL) = Ceil(30.45) = 31 kbps (for the IPv4 case), Ceil(38.85) = 39 kbps (for the IPv6 case) | Non-guaranteed quality 12.2 kbps
Residual BER | 10^-5 | 16 bit CRC
SDU error ratio | 7*10^-3 |
Traffic handling priority | Not used in Conversational traffic class |
Transfer delay | 100 ms |
SDU format information | Not used |
Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application
Source statistics descriptor | "Speech" |





Use case 2 - Unidirectional video 

This use case includes the scenario in which two conversational multimedia terminals establish a unidirectional video
connection, using the H.263 or MPEG-4 codecs.

The video codec in this example has a bitrate of 36 kbps, with RTP payload packets of 75 bytes (excluding payload 
header which is, for example, 2 bytes). The sending terminal would produce IP packets of the following size: 



20 (IPv4) + 8 (UDP) + 12 (RTP) + 77 (video RTP payload+payload header) = 117 bytes, or

40 (IPv6 with no extension headers) + 8 (UDP) + 12 (RTP) + 77 (video RTP payload+payload header) = 137 bytes.



The gross bit rate including uncompressed RTP/UDP/IPv4 headers would be 56.2 kbps. The value in the b=AS media 
level parameter would be 57. The gross bit rate including uncompressed RTP/UDP/IPv6 headers would be 65.8 kbps. 
The value in the b=AS media level parameter would be 66. 
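
The gross bit rates above can be reproduced with the following Python sketch. It assumes that the 36 kbps codec bit rate counts only the 75 payload bytes of each packet (payload header excluded), which gives 60 packets per second. That assumption is not stated in the text but reproduces the quoted figures. The function names are illustrative.

```python
from fractions import Fraction
from math import ceil

UDP_BYTES = 8
RTP_BYTES = 12


def video_gross_bps(codec_bps: int, payload_bytes: int,
                    payload_header_bytes: int, ip_header_bytes: int) -> Fraction:
    """Gross bit rate including uncompressed RTP/UDP/IP headers."""
    # Assumption: the codec bit rate covers the payload bytes only.
    packets_per_second = Fraction(codec_bps, payload_bytes * 8)
    packet_bytes = (ip_header_bytes + UDP_BYTES + RTP_BYTES
                    + payload_bytes + payload_header_bytes)
    return packet_bytes * 8 * packets_per_second


def b_as_kbps(gross_bps: Fraction) -> int:
    """b=AS value: gross bit rate rounded up to a whole kbps."""
    return ceil(gross_bps / 1000)


for label, ip_header in (("IPv4", 20), ("IPv6", 40)):
    gross = video_gross_bps(36000, 75, 2, ip_header)
    print(label, float(gross) / 1000, b_as_kbps(gross))
```

This gives 56.16 kbps (b=AS 57) for IPv4 and 65.76 kbps (b=AS 66) for IPv6, which round to the 56.2 and 65.8 kbps quoted above.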

The maximum video packet size is limited to 512 bytes in section 5.2. This value is fine if transmission occurs over the 
UMTS Iu interface. However, in order to avoid SNDCP fragmentation of packets over the GERAN Gb interface (where 
the default size for LLC data field (=SNDCP frame) is 500 bytes) the maximum IP packet size is 500 - 4 
(unacknowledged mode SNDCP header) = 496 bytes. Therefore, the maximum size of a video packet is 496 - 60 
(RTP/UDP/IPv6 uncompressed headers) = 436 bytes (including RTP payload header). 400 bytes is a safer value. 
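
The size budget in the paragraph above can be written out as a short Python calculation. The constant names are illustrative, and the values are the ones quoted in the text.

```python
LLC_DATA_FIELD_BYTES = 500        # default LLC data field (SNDCP frame) size
SNDCP_UNACK_HEADER_BYTES = 4      # unacknowledged mode SNDCP header
RTP_UDP_IPV6_BYTES = 12 + 8 + 40  # uncompressed headers, no extension headers

max_ip_packet = LLC_DATA_FIELD_BYTES - SNDCP_UNACK_HEADER_BYTES
max_video_packet = max_ip_packet - RTP_UDP_IPV6_BYTES

print(max_ip_packet, max_video_packet)  # 496 436
```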



The QoS profile of the receiving terminal would be set then using the following parameters: 
Table B.3: QoS profile for unidirectional video at 36 kbps

QoS parameter | Parameter value | Comment
Delivery of erroneous SDUs | No |
Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering
Traffic class | Conversational |
Maximum SDU size | 500 bytes | 10 bytes granularity
Guaranteed bitrate for downlink | SDP media bw in DL + 2.5% * (SDP media bw in DL) = Ceil(58.43) = 59 kbps (for the IPv4 case), Ceil(67.65) = 68 kbps (for the IPv6 case) |
Maximum bit rate for downlink | Equal or higher than guaranteed bit rate |
Guaranteed bitrate for uplink | 2.5% * (SDP media bw in DL) = Ceil(1.43) = 2 kbps (for the IPv4 case), Ceil(1.65) = 2 kbps (for the IPv6 case) | For RTCP
Maximum bit rate for uplink | Equal or higher than guaranteed bit rate |
Residual BER | 10^-3 | 16 bit CRC
SDU error ratio | 10" d |
Traffic handling priority | Not used in Conversational traffic class |
Transfer delay | 250 ms |
SDU format information | Not used |
Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application
Source statistics descriptor | "Unknown" |





Use case 3 - Video telephony 

This use case includes the scenario in which two conversational multimedia terminals establish a bi-directional 
speech/video connection, using the AMR/AMR-WB and H.263/MPEG-4 codecs at the same bit rates in uplink and
downlink directions. 

The video codec in this case has a bitrate of 28 kbps, with RTP payload packets of 250 bytes (excluding payload header 
which is, for example, 2 bytes). The total video bit rate is 32.7 kbps (including RTP/UDP/IPv4 headers). The value in 
the b=AS media level parameter would be 33. For IPv6 addressing, the total video bit rate is 34.9 kbps (including 
RTP/UDP/IPv6 headers). The value in the b=AS media level parameter would be 35.

In the same bearer there is an AMR stream at 10.2 kbps with 1 frame encapsulated per RTP packet using the bandwidth
efficient mode. The total voice bit rate is 26.8 kbps (including RTP/UDP/IPv4 headers). The value in the b=AS media 
level parameter would be 27. For IPv6 addressing, the total voice bit rate is 34.8 kbps (including RTP/UDP/IPv6 
headers). The value in the b=AS media level parameter would be 35. 

The total media bit rate is 28+10.2=38.2 kbps. The total session bit rate is 33+27=60 kbps for IPv4 addressing, and
35+35=70 kbps for IPv6 addressing.

The terminal would produce IP packets of the following size: 

AMR: 20 (IPv4) + 8 (UDP) + 12 (RTP) + 27 (AMR RTP payload) = 67 bytes (or 87 bytes for IPv6 with no extension 
headers). 

Video: 20 (IPv4) + 8 (UDP) + 12 (RTP) + 252 (video RTP payload+payload header) = 292 bytes (or 312 bytes for IPv6 
with no extension headers). 

The considerations made in use case 2 about the maximum packet sizes also apply to this use case.

The QoS profile of the videotelephony terminal would be set then using the following parameters: 

Table B.4: QoS profile for videotelephony at 38.2 kbps

QoS parameter | Parameter value | Comment
Delivery of erroneous SDUs | No |
Delivery order | No | To minimize delay in the access stratum. The application should take care of eventual packet reordering
Traffic class | Conversational |
Maximum SDU size | 500 bytes | 10 bytes granularity
Guaranteed bitrate for downlink | SDP media bw in DL for AMR + 2.5% * (SDP media bw in DL for AMR + SDP media bw in UL for AMR) + SDP media bw in DL for video + 2.5% * (SDP media bw in DL for video + SDP media bw in UL for video) = Ceil(63.0) = 63 kbps (for the IPv4 case), = Ceil(73.5) = 74 kbps (for the IPv6 case) |
Maximum bit rate for downlink | Equal or higher than guaranteed bit rate |
Guaranteed bitrate for uplink | SDP media bw in UL for AMR + 2.5% * (SDP media bw in UL for AMR + SDP media bw in DL for AMR) + SDP media bw in UL for video + 2.5% * (SDP media bw in UL for video + SDP media bw in DL for video) = Ceil(63.0) = 63 kbps (for the IPv4 case), = Ceil(73.5) = 74 kbps (for the IPv6 case) |
Maximum bit rate for uplink | Equal or higher than guaranteed bit rate |
Residual BER | 10^-3 | 16 bit CRC
SDU error ratio | 10" d |
Traffic handling priority | Not used in Conversational traffic class |
Transfer delay | 100 ms |
SDU format information | Not used |
Allocation/retention priority | Subscribed allocation/retention priority | Not relevant for the application
Source statistics descriptor | "Unknown" |

In case of usage of separate PDP contexts for the speech and video streams, the speech stream QoS profile parameters
are set similarly to use case 1, while the video stream QoS profile parameters are set similarly to use case 2 (but 
considering that the video flow is bi-directional and considering possibly the same UMTS bearer transfer delay 
constraints for both media). 
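
For the single-bearer case of Table B.4, the guaranteed bit rate is the sum of the per-media terms, rounded up once at the end. The Python sketch below illustrates this with the b=AS values of use case 3. The function name and the tuple layout are assumptions made for the example. Exact fractions are used because a sum such as 63.0 must not be pushed past the integer boundary by floating-point error before the ceiling is taken.

```python
from __future__ import annotations

from fractions import Fraction
from math import ceil

OVERHEAD_SHARE = Fraction(25, 1000)  # the 2.5% term of the table formula


def shared_bearer_gbr_kbps(streams: list[tuple[int, int]]) -> int:
    """Guaranteed bit rate for several media streams on one bearer.

    Each entry is (own-direction b=AS, reverse-direction b=AS) in kbps.
    """
    total = sum(own + OVERHEAD_SHARE * (own + reverse)
                for own, reverse in streams)
    return ceil(total)


# (AMR 10.2 kbps, video 28 kbps), same bit rates in uplink and downlink.
print(shared_bearer_gbr_kbps([(27, 27), (33, 33)]))  # IPv4: 63
print(shared_bearer_gbr_kbps([(35, 35), (35, 35)]))  # IPv6: 74
```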